perm filename AI[F75,JMC] blob sn#179032 filedate 1975-09-25 generic text, type C, neo UTF8
General article on AI

@. The problem of general intelligence.

@. Heuristic search - product spaces.

@. Motivational structures.

@. Epistemology, metaphysics and heuristics.

Motivational structures

	The problem of motivational structure arises in artificial intelligence
in two ways.  First, we must decide what kind of motivational structure to
equip our robot with.  It must be general enough so that it can be used to
deal with complexes of subgoals as well as with the overall goal.

	Second, the robot must deal with humans and other artificial systems,
and it must make hypotheses about their motivational structures in order
to know what they are likely to do.  Moreover, if the robot is to work for the
welfare of one or more humans, it must do what they would want or what it
thinks is good for them.

	We shall begin with a typology of motivational structures.

	1. The simplest case is stimulus-response.  The system gives
responses to stimuli according to a certain rule.  Assume that the rule is
formulated most simply without reference to optimization or any other
kind of motivational structure.  In this case, the system has a trivial
motivational structure, and it may not be appropriate to use motivational
terminology such as goals in reasoning about the behavior of the system.
While humans have much more complex motivational structures, in some situations
they can be regarded as S-R systems and even prefer to be regarded that
way.  For example, a busy hot dog vendor prefers to be regarded as a machine
that, when given money, dispenses hot dogs.  He really doesn't want his customers
thinking about his goals as long as the situation remains simple and he
remains busy.
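
	As a sketch in programming terms (the stimuli and responses here are
invented), a pure S-R rule is just a fixed table from stimulus to response,
with no internal state and no goals worth mentioning:

    # Hypothetical sketch of a pure stimulus-response rule: a fixed mapping,
    # with no internal state and no motivational structure to speak of.

    responses = {
        "money offered":  "dispense hot dog",
        "question asked": "point at menu",
    }

    def respond(stimulus):
        return responses.get(stimulus, "ignore")

    print(respond("money offered"))   # "dispense hot dog"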

	2. The next important case is the simple optimizer.  The system acts
in a certain situation, events occur, and after a definite time the
episode terminates in a situation that is evaluated numerically.  The problem
is to choose the strategy that maximizes the expected value of the evaluation
function.  There are two important special cases: (1) there is a goal
predicate that determines whether the goal is achieved, and the evaluation
takes the value 1 or 0 according to whether the goal is achieved or not;
(2) no probabilities are involved.
Rather than maximize an expectation, the system may wish to minimax the
final value.  In real life, true minimax doesn't occur, because human life
is subject to disasters that are ordinarily not allowed for.  Thus we may
minimax a picnic plan by having an alternative in case of rain, but true
minimax would involve maximizing the value in the worst possible eventuality,
which might be nuclear war.
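
	The contrast between maximizing an expectation and minimaxing can be
put in a small sketch; the strategies, probabilities and values below are
invented for illustration:

    # Hypothetical sketch: choosing a strategy by expected value versus by
    # worst case (minimax).  Each strategy is a list of (probability, value)
    # pairs over possible outcomes.

    def expected_value(outcomes):
        return sum(p * v for p, v in outcomes)

    def worst_case(outcomes):
        return min(v for _, v in outcomes)

    strategies = {
        "picnic, no backup":         [(0.8, 10), (0.2, -5)],  # fine weather vs. rain
        "picnic with indoor backup": [(0.8, 7),  (0.2, 4)],
    }

    best_by_expectation = max(strategies, key=lambda s: expected_value(strategies[s]))
    best_by_minimax     = max(strategies, key=lambda s: worst_case(strategies[s]))
    print(best_by_expectation)   # "picnic, no backup": expectation 7.0 vs 6.4
    print(best_by_minimax)       # "picnic with indoor backup": worst case 4 vs -5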

	3. The next case often considered is optimization over time.
The value to be optimized is some kind of sum or integral of values
of future situations.  Usually some kind of discount of the future is
assumed so that a goodie tomorrow is not assigned as high a value
as a goodie today.  In most cases, the value at a given time is also
a sum of subvalues assigned to aspects of the situation.  This kind of
optimization is often appropriate for business firms and is often assumed
as guiding the behavior of a %2rational%1 man.  Besides its a priori
plausibility, this kind of evaluation is the beneficiary of some limit
theorems.  For example, suppose that our robot is going to engage in a
sequence of small financial transactions each of which will gain or lose
money, that its eventual wealth will be the sum of the gains of the
separate transactions, and that its evaluation of its eventual
wealth is merely monotonic in the wealth.  Then we can probably show
under rather weak assumptions that it should behave with regard to
each individual transaction as though it were maximizing expected gain,
i.e. a robot that behaves that way will with probability near one have
more money than a robot that behaves some other way.
A similar result regarding the expected value of the logarithm of the
ratio of wealth after and before a transaction can probably be shown
when the transactions are short-term investments that pay off in proportion
to the amount of money invested.
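
	A small sketch may make the discounted-sum criterion and the
expected-gain claim concrete.  The gambles and numbers below are hypothetical,
and the simulation only illustrates, rather than proves, the limit theorem
alluded to above:

    import random

    def discounted_sum(values, discount=0.9):
        # A goodie tomorrow counts for less than a goodie today.
        return sum(v * discount**t for t, v in enumerate(values))

    def final_wealth(gamble, steps=2000):
        # Eventual wealth is the sum of the gains of the separate transactions.
        return sum(gamble() for _ in range(steps))

    gamble_a = lambda: random.choice([3, -1])    # expected gain per transaction: 1.0
    gamble_b = lambda: random.choice([10, -9])   # expected gain per transaction: 0.5

    trials = 1000
    a_richer = sum(final_wealth(gamble_a) > final_wealth(gamble_b)
                   for _ in range(trials)) / trials

    print(discounted_sum([1, 1, 1]))   # 1 + 0.9 + 0.81 = 2.71
    print(a_richer)                    # usually near 1: the higher expected-gain
                                       # robot ends up with more money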

	However, the assumption that this kind of optimization can be used
to describe very much of actual human behavior is mistaken as we will try
to show.

	4. The simplest way to expound my view of actual human motivational
structure is to recall that we are evolved from animals and to start
with their motivational structures.

	No-one would imagine that a dog or an ape behaves in such a way as
to maximize a discounted sum of rewards over the course of its life.
Instead, the animal is regarded as having various drives and when one of
these drives is excited the animal attempts to satisfy it.  If it succeeds,
it enters some residual motivational state that results mainly in lying
about until the next episode of stronger motivation occurs.  The picture
is somewhat complicated by the existence of simple S-R behavior and
by some priorities among drives so that an effort to satisfy one drive
can be interrupted either temporarily or permanently by the need to
satisfy a higher priority drive.

	Before proceeding further we shall make a digression on the relation
between S-R descriptions of behavior and descriptions in terms of goals.
The simplest S-R behavior gives a relation between input and output and
doesn't take internal state into account.  The next more complicated behavior
allows internal state to determine an S-R relation.  Still more complicated
is the automaton behavior description in which the stimulus changes the
internal state and the output depends on both the input and the internal
state.  If we don't limit the number of internal states and we allow arbitrary
kinds of presentations of the laws of state succession, then any behavior
admits an automaton description.  However, the automaton mode of description
may not be a reasonable way of expressing what we actually know about the
behavior.  Take the example of a sheep dog in an obedience trial that must
herd some sheep into a pen.  We usually won't know the internal states of
the dog or even the rules that determine which sheep the dog will run
after next or when it will bark.  Instead our knowledge can be described
by saying that the dog will assume positions relative to the sheep and will
bark in such a way as to drive the sheep into the pen, assuming that the
sheep behave according to the expectations generated by the dog's
previous experience, about which we may know little.  In this case, our
knowledge of the dog's behavior can only be conveniently summarized by
saying that the dog is pursuing the goal of driving the sheep into the
pen by running around and barking using his eyes, ears and nose to detect
the location of the sheep.  While there are some cases in which either
S-R descriptions or descriptions in terms of goals express our knowledge
equally well, it is very common that goal descriptions work well where
S-R descriptions are hopeless without sneaking the goal back into the
rule that gives the response.  The nature of evolution, of learning from
experience, and of conscious design makes this inevitable.  As an example
of conscious design, consider that AT&T is always changing the telephone
system by putting in satellite links, electronic exchanges, etc., but
from the user's or even the operator's point of view, the automatic
mechanism is well described as one that finds a path from the caller's
telephone to that of the called party.
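
	For definiteness, here is a sketch of an automaton description in the
above sense; the states, stimuli and responses are invented.  The stimulus
changes the internal state, and the response depends on both the stimulus
and the state:

    # Hypothetical sketch of an automaton description: a table of
    # (state, stimulus) -> (new state, response).

    transitions = {
        ("resting", "sees rabbit"):    ("chasing", "run after it"),
        ("chasing", "rabbit escapes"): ("resting", "lie about"),
        ("resting", "hears whistle"):  ("heeling", "return to master"),
    }

    def step(state, stimulus):
        # Unknown (state, stimulus) pairs leave the state alone.
        return transitions.get((state, stimulus), (state, "no response"))

    state = "resting"
    for stimulus in ["sees rabbit", "rabbit escapes", "hears whistle"]:
        state, response = step(state, stimulus)
        print(stimulus, "->", state, response)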

	Animal behavior involves at least the following complications:

		a. It is often described by a combination of S-R and
goal seeking terms, e.g. when the dog sees a rabbit, it tries to catch
it, or when the wasp has found a larva in which to lay its eggs, it
tries to find a place to bury it.

		b. Even in animals, goals give rise to subgoals.

		c. Animals can be trained to have goals by setting
them up as subgoals for existing goals, but the learned subgoals can
become independent of the goals that originally gave rise to them.
To make a rather strong claim, it seems that a dog can acquire the
general goal of obeying its master, i.e. figuring out what he wants
done and doing it - within the dog's limited ability to ascribe
goals to its master.


	Returning to human behavior, from the perspective of our evolution,
it seems very far-fetched to imagine that a human behaves so as to optimize
a sum of benefits over his life.  Indeed, it seems remarkable that humans
have any long range goals at all, i.e. goals whose achievement takes years
or even extends beyond the lifetime of the individual.  Many primitive
societies seem to operate on a day to day basis without long range goals,
and some back-to-nature ideologists have formed the long range goal of
inducing people to act in the here-and-now without long range goals.

	With this perspective, let us try to formulate a general notion of
human motivational structure.

	1. Some human behavior is S-R, and some can be described conveniently
in automaton form without goals.  In my present opinion, the simple S-R
case describes most of what is describable by automata at all.

	2. The next level of goal is a built-in predicate or evaluation function
on situations.  Thus people behave so as to reduce thirst or hunger.

	3. Next we have subgoals that remain subordinate to the main goal.
Thus a person may buy food in order to eat, but if he is offered a free
meal or it becomes clear that he will not get to eat the food he buys, he
will abandon the subgoal of buying the food.

	4. Next we have goals that were originally subgoals but which have
escaped the original main goals.  The goal of not hurting people's feelings
may have originally been a subgoal of averting some kinds of adverse reactions,
but in most adults and older children it has assumed an independent form.

	5. Next we have goals that have been generated by goal-generating
goals.  Thus most people wish to have goals and adopt them according to
certain criteria of what constitutes an acceptable goal, but once adopted,
the goals have an independent existence.
From the point of view of mathematical logic, a goal to have goals is a
higher order entity, i.e. if a goal is a predicate or evaluation function
on situations, then a higher order goal is a predicate or evaluation
function on such functions with lower order arguments as well.
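
	The logical point can be sketched directly; the particular goals and
situations are invented.  A first order goal is a predicate on situations,
and a higher order goal is a predicate on such goal-functions, here taking
lower order arguments as well:

    # Hypothetical sketch: first-order goals are predicates on situations;
    # a higher order goal is a predicate on goal-predicates themselves.

    def not_hungry(situation):
        return situation.get("hunger", 0) < 3

    def not_thirsty(situation):
        return situation.get("thirst", 0) < 3

    def acceptable_goal(goal, envisaged_situations):
        # Accept a goal only if some envisaged situation would satisfy it.
        return any(goal(s) for s in envisaged_situations)

    situations = [{"hunger": 1, "thirst": 5}, {"hunger": 4, "thirst": 5}]
    print(acceptable_goal(not_hungry, situations))   # True
    print(acceptable_goal(not_thirsty, situations))  # False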

	When we impute a motivational structure to a person and try to
use it to predict his behavior, we have to take into account that goals
may conflict.  The simplest model of this behavior would be one that
assigns weights to the goals in order to generate an over-all evaluation
function and supposes that the behavior will be such as to optimize
this over-all function.  In a mathematical sense, such an over-all
function may exist, but our knowledge of other people's behavior or even
of our own behavior does not take the form of knowledge of such a function.
Rather we know some priorities and to a lesser extent weightings of goals,
but in the less common or more complicated conflicts, it is difficult
to decide what the person will do, partly because the conflict will induce
higher order reasoning about the motivational structure itself.
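
	A sketch of the two ways of combining conflicting goals mentioned
above; the weights, priorities, goals and situations are invented:

    # Hypothetical sketch: a weighted over-all evaluation function versus a
    # simple priority ordering among goals.

    goal_values = {
        "not hungry": lambda s: 1.0 - s["hunger"],
        "save money": lambda s: 1.0 - s["spending"],
    }
    weights  = {"not hungry": 0.7, "save money": 0.3}
    priority = ["not hungry", "save money"]      # lexicographic alternative

    def overall(situation):
        return sum(w * goal_values[g](situation) for g, w in weights.items())

    def by_priority(situations):
        # Prefer the situation best on the highest-priority goal, breaking
        # ties by the next goal in the priority list.
        return max(situations, key=lambda s: [goal_values[g](s) for g in priority])

    options = [{"hunger": 0.2, "spending": 0.9}, {"hunger": 0.5, "spending": 0.1}]
    print(max(options, key=overall))   # the weighted function picks the thrifty option
    print(by_priority(options))        # the priority ordering picks the well-fed option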

	There is a further complication in describing human motivational
behavior.  A person's structure of ideas is most conveniently described
in symbolic form, e.g. as a collection of sentences in some language or as
some kind of net.  Moreover, many of the ways in which new ideas arise
from old ideas and from observation resemble the operations of mathematical
logic.  Indeed, we advocate basing our formulation of people's goal
structures and our design of robots substantially or even mainly on
a logical formulation.  However, the chemical signalling system of
animal life is even older than the nervous system.